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Abstract. Deconvolution of the telescope Point Spread Function (PSF) is necessary for even moderate dynamic range imaging 
with interferometric telescopes. The process of deconvolution can be treated as a search for a model image such that the residual 
image is consistent with the noise model. For any search algorithm, a parameterized function representing the model such that 
it fundamentally separates signal from noise will give optimal results. In this paper, the first in a series of forthcoming papers, 
we argue that in general, spatial correlation length (a measure of the scale of emission) is a stronger separator of the signal from 
the noise, compared to the strength of the signal alone. Consequently scale sensitive deconvolution algorithms result into more 
noise-like residuals. We present a scale-sensitive deconvolution algorithm for radio interferometric images, which models the 
image as a collection of Adaptive Scale Pixels (Asp). Some attempts at optimizing the runtime performance are also presented. 
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1. Introduction 

Interferometric telescopes employ the van Cittert-Zernike the- 
orem to synthesize ape rtures much larger tha n the size of 
the individual antennas dThompson et aljr200ll) . Such instru- 
ments measure the source coherence function, which is the 
Fourier transform of the sky brightness distribution. However 
the source coherence function is measured at discrete points in 
the Fourier space resulting into a point spread function (PSF) 
which has significant wide-spread sidelobes. The presence of 
these sidelobes limits the dynamic range of raw images made 
with such telescopes. For even moderate dynamic range imag- 
ing, deconvolution of the PSF is necessary. 

The measurement equation describing an interferometer 
can be written as: 



y = A/" + AN 



(1) 



where V and are the measured visibility and independent 
random noise vectors respectively, and /" is the true image. 
The measurement matrix A represents the linear transform from 
the image to the visibility domain. For perfectly calibrated co- 
planar Fourier synthesis devices like imaging interferometric 
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telescopes, A is the rectangular observation matrix, the non- 
zero elements of which are the sines and cosines corresponding 
to the measured fourier components. In practice, due to incom- 
plete sampling of the visibility domain, A in general is a singu- 
lar non-square matrix, excluding the use of linear methods to 
invert the above equation. Non-linear deconvolution methods 
are therefore used to estimate /". In the usual terminology used 
in the literature, A^V is the dirty image (/'') and A^A = B is 
the beam matrix (the super-script T implies a transpose). B is 
a toeplitz matrix of the dirty beam or the point spread function 
and BI" represents the convolution of the true image with the 
PSF. Note that the resultant noise vector in the image domain 
(A^AA^) is also convolved with the PSF and the pixel-to-pixel 
noise in the image is not independent. 

The process of deconvolution can be described as a search 
for a model image which solves the normal equation 

A^y = A^AI'^ + A'^AN (2) 

A non-linear minimization scheme can be set up which it- 
eratively minimizes the objective function until the residuals 
y - A/** are noise-like. 

The dimensionality and the nature of the search space is 
governed by the parameterization of l'^. Scale insensitive de- 
convolution algorithms like CLEAN (■Hogbom.1974) and its 
variants model the image as 



Xk,y-yk) 



(3) 



2 



S. Bhatnagar and T.J. Cornwell: Scale sensitive deconvolution of interferometric images 



which is a collection of delta functions of amplitude Fk at each 
pixel location. An N x M pixel clean-box corresponds to a 
N X M dimensional search space. The CLEAN algorithm it- 
eratively estimates the Fi.s by taking a fixed sized step along 
the axis of highest d erivative (the peak in the residual image) 
(see ISch warz 19781 for a more complete description). This 
is done by updating the residual image at each iteration as 
if - /* J - gB l^max/* jj where g is called the loop-gain (it 
controls the step-size). This implicitly assumes an orth ogonal 
search space of constant curvature. Variants of CLEAN ( iClarkl 
Il980l) operate in two cycles called major and minor cycles. 
The minor cycle uses an approximate PSF and builds a shal- 
low model image. The accuracy lost in the minor cycle due to 
the use of approximate PSF is recovered in the more expensive 
major cycle where the residual image is computed at full ac- 
curacy as It can therefore be considered to be 
a steepest descent algorithm to minimize the objective func- 
tion = [y - A/"]^ W [y - A/^]. The dimensionality of the 
search space can be constrained by the user defined clean-box 
which usually remains fixed after initial specification. The step 
size is also a user defined parameter, typically < 0.2. The stop- 
ping criterion is a combination of the maximum number of iter- 
ations and the magnitude of the maximum residual. Typically, 
the algorithm is stopped (often by user intervention or by ad- 
justing the maximum number of iterations) when maximum 
residuals are comparable to the estimated noise (another user 
defined parameter). The regularization to avoid the problem of 
over fitting in such an unconstrained minimization is explicitly 
imposed via forcing a finite number of iterations. 

Statistical image reconstruction algorithms like Maximum 
Entropy Method (MEM) ( Narayan & Nityananda 1986, and 
references therein) on the other hand set up a formal con- 
strained minimization algorithm which minimizes the objective 
function /(/**, A) - H - Ax^ where H is the entropy function, 
/I is a Lagrange multiplier and the model image is parameter- 
ized as in Eq. |3l The function H is derived from a physically 
meaningful prior distribution and imposes various desirable 
constraints like smoothness. An extra term, sometimes with an- 
other undetermined Lagrange multiplier is also used to impose 
the positivity constraint. The Entropy function acts as a reg- 
ularizer, biasing the solution towards the supplied prior image 
while pulls the solution towards the best fit to the data. From 
a set of images all of which satisfy the normal equation (Eq.|2}, 
the MEM image corresponds to the mode of the a-posterior dis- 
tribution. The step size computation for the minimization of / 
requires an evaluation of the inverse of the Hessian matrix. For 
images with a large number of pixels with significant emission, 
the size of the Hessian matrix can be large and inverting it typ- 
ically requ ires SVD like algor i thms w hich are computationally 
expensive. ICornwell & Evansl(ll985h devised a fast iterative al- 
gorithm by approximating the Hessian by a diagonal matrix to 
gain in speed. 

Both approaches however use the parameterization in Eq.|3l 
which we refer to as a scale-less model. The image is decom- 
posed into a set of delta functions and the search space is as- 
sumed to be orthogonal. 



Coupling of the pixels in the dirty image comes from two 
sources. The pixels of the underlying true image are inherently 
not independent in the presence of extended emission. The sec- 
ond source of coupling comes from the PSF. As mentioned ear- 
lier, the main lobe of the PSF has a finite width (proportional 
to the diffraction limit of the synthesized aperture) and signif- 
icant wide-spread sidelobes. The former kind of coupling im- 
plies that the search space is potentially non-orthogonal unless 
the true sky is composed only of clearly unresolved sources. 
The latter however implies a coupling of even widely separated 
unresolved sources via the sidelobes of the PSF. Ignoring the 
coupling due to the PSF possibly only results into slower con- 
vergence. However, ignoring the inherent coupling of pixels for 
extended emission allows more degrees of freedom (DOE) than 
required, which results into the breaking up of the extended 
emission leaving low level correlated residuals. The peak resid- 
uals are comparable to the estimated noise but are correlated 
at scales larger than the resolution element (the synthesized 
beam). In the absence of any scale information in Eq.|3] such 
residual emission cannot be recovered. 

It is easy to see that an optimal image reconstruction al- 
gorithm will also use the minimum number of DOF to repre- 
sent the image. However in practice, it is almost impossible 
to determine this minimum number, and hence also difficult to 
design an algorithm which will achieve such a representation 
Practically therefore, the goal is to seek a solution with least 
complexity. From the point of view of imaging weak extended 
emission, the important improvement in the deconvolution al- 
gorithm is therefore to decompose the true sky image in a scale 
sensitive basis. This implies a significant increase in algorithm 
complexity and computational cost. In this paper we describe a 
scale sensitive deconvolution algorithm for interferometric im- 
ages and attempts at improving its performance. 

2. Fundamental separation of signal and noise 

Equation|2lcan be written in terms of the dirty and true images 



BF + r where F 



BN 



(4) 



The distinction between the true and the dirty image can then 
be stated as the latter being equal to the true image corrupted 
in a deterministic way by the sidelobes of the PSF and in a 
non-deterministic way by the additive noise image (l'^). These 
corruptions are represented by the two terms in Eq. |3 Prior 
knowledge of the deterministic pattern of the PSF is explicitly 
used by deconvolution algorithms which attempt to recover /" 
from F given B. This approach works well where the first term 
in the above equation dominates. For weak large scale emis- 
sion, the effect of the second term is significant. It is easy to see 
that without explicitly incorporating prior information which 
fundamentally separates the two terms, such features in /" can- 
not be recovered. Indeed, it is well known that the residuals 
of scale-less deconvolution are correlated with the large scale 
features and large scale features are in general poorly recon- 
structed. 

The term involving in Eq.[I]can be shown to be a gaus- 
sian random process («Thompson et al..20Ql.) . with the pixel-to- 
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pixel noise being independent (zero correlation length). Ideally, 
in the absence of the convolution with the PSF, this would 
result into the noise image with a auto-correlation func- 
tion (ACF) of zero width. However the main lobe of the PSF 
(the synthesized beam or the nominal resolution element) has a 
width larger than the image pixel size. Convolution by the PSF 
therefore results into smoothing of all pixel-to-pixel variations 
at scales smaller than the synthesized beam making the auto- 
correlation width of of the order of the resolution element. 
This implies that the largest correlated feature in is of the or- 
der of the resolution element. On the other hand any physically 
plausible model image representing an observation of /" will 
have minimum correlation length of the order of the resolution 
element of the imaging device. The range of allowed scales of 
emission (correlation length) therefore provides a fundamen- 
tal handle for separating from /''. Correlated emission in I"^ 
at scales much larger than the resolution element must come 
from real extended emission. Correlation length alone for such 
emission fundamentally separates it from the noise. For unre- 
solved features (scale comparable to the resolution element) in 
the strength of the emission provides a similar handle (fea- 
tures at scales smaller than, or equal to the resolution element 
and weaker than the estimated noise RMS are indistinguishable 
from /^). The ACF of l'^ , could have low level wings at scales 
marginally larger than the main lobe (due to the sidelobes of 
the PSF). However, for observations with good uv-coverage', 
the sidelobes are small and the resulting wings in the ACF will 
be at a very low level. For real emission at scales in this regime 
where the fundamental scales of the noise in the image domain 
and the real emission overlap, the strength of emission plays 
a crucial role along with the scale of emission in separating 
the signal from the noise (correlated emission at these scales, 
which is also significantly stronger than the expected noise is 
more likely to be due to the real emission). A combination of 
scale and the strength of emission therefore fundamentally sep- 
arates noise from the signal. Scale sensitive deconvolution al- 
gorithms incorporate this information explicitly in the decon- 
volution process and hence leave more noise-Uke residuals. 

3. Overview of scale sensitive methods 

Scale sensitive (multi-scale) methods seek to represent the in- 
herent coupling of the pixels in the true image by decomposing 
it in a basis which locally minimizes the complexity of repre- 
sentation. Since the use of minimum DOF also corresponds to 
minimum complexity, in practice the problem reduces to find- 
ing the largest locally best fit scale in the image. Clearly, the 
efficiency with which this can be done critically depends on the 
degree of coupling between pixels due to the PSF. The larger 
the scale of this coupling, the more complex is the structure 
of the covariance matrix. Consequently, the deconvolution al- 
gorithms become more complex and computationally expen- 
sive. Efficient scale sensitive deconvolution algorithms for im- 
ages from filled aperture instruments with additive uncorre- 
lated noise now exist. Efficient algorithms for the deconvolu- 

' Good uv-coverage is a pre-requisite for high fidehty and dynamic 
range imaging of extended emission. 



tion of interferometric images, where the PSF has significant 
large scale sidelobes and the noise in the image domain is cor- 
related, pose a bigger challenge in comparison to filled aperture 
telescopes. 

3.1. The Pixon method: for filled aperture instruments 

The Pixon method jPuetter & Pinal fl994l) iteratively decom- 
poses the true image as a collection of locally best-fit kernels 
(usually gaussians). A kernel is used for every pixel in the im- 
age and the parameters of these kernels are determined by a 
localized fitting to the data using a user defined goodness of 
fit (GOF) criteria. A kernel is allowed to be as wide as pos- 
sible while simultaneously satisfying the GOF locally. For the 
case of filled aperture telescopes, the PSF has a limited sup- 
port and along with the local kernel, provides a limited foot- 
print on the data (the raw image) making the concept of "local" 
well defined. As one can imagine, as the decomposition pro- 
ceeds, many kernels will cease to be significant (corresponding 
to pixels which are better represented as part of a larger kernel) . 
A patented fast Pixon algorithm exists (|Puetter& YahiJl999l) . 
the details of which are unfortunately not known due to patent 
restrictions. Since the criteria for the acceptance of the kernels 
is enforced only locally, the method can deal with non-uniform 
noise across the image and recovers low level large scale fea- 
tures effectively. 

The assumptions of a finite support PSF and additive in- 
dependent noise in the images are central to this method. 
Therefore, despite its impressive successes with images from a 
wide variety of filled aperture imaging devices, it is not suited 
for interferometric imaging where both these assumptions are 
grossly invalid. 

3.2. The Multi-scale Clean: for interferometric 
instruments 

The Multi-scale Clea n (MS-Clean) method 
jHoldawav & Cornwelll [2004) is motivated by the Pixon 
approach and the usual CLEAN algorithm. The update 
direction at each iteration is given by the residual image 
which involves a convolution of the current model image 
with the PSF. In general, because of the complex structure 
of the PSF, this convolution has to be numerically computed. 
The normal CLEAN algorithm gains in performance by 
modeling the emission as in Eq. |3] where the convolution 
with the PSF reduces to a scale-shift-and-add (of the PSF) 
operation. MS-Clean retains this scale-shift-and-add nature 
of the algorithm by modeling the emission as a collection 
of symmetric gaussians at a few scales. The convolution of 
these components with the PSF is pre-computed. Versions of 
the current residual image smoothed by these gaussians are 
also maintained. At each iteration, a global peak among all 
these enumerated scales is searched and the version of the 
pre-computed convolved PSF at the scale at which the peak 
was found is subtracted from the residual images at all scales. 
A gaussian of the scale at which the peak was found is added 
to the list of components. 
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This algorithm better recovers the large scale emission than 
does the scale-less CLEAN algorithm. However the scale sizes 
are restricted to the few enumerated scales. Secondly, since the 
PSF at various scales are pre-computed, non-symmetric com- 
ponents instead of circularly symmetric gaussians are expen- 
sive to use. As a result, non-symmetric features are broken up 
into a series of smaller scale components. This leads to the 
same problem of breaking of structure - only the error is at 
lower spatial frequencies. Also, since the removal of compo- 
nents at each successive iteration is decoupled from all previ- 
ous components, errors in earlier iterations can only be com- 
pensated by adding more components such that the errors are 
corrected. In the space of the hyper-parameters (the enumerated 
scales), it effectively retains the assumption of an orthogonal 
search space (diagonal approximation of the Hessian). 

4. The Asp image reconstruction algorithm 

The general problem of scale-sensitive deconvolution is that of 
a function optimization in a high dimensional, non-orthogonal 
space. The curvature and dimensionality of the space is largely 
determined by the parameterization of and the extent of the 
PSF. The Pixon method exploits the locality of the effects of 
the PSF to limit the dimensionality of the search space. On the 
other hand, MS -Clean explicitly limits the dimensionality of 
the space by decomposing into a fixed set of few scales. 
This fixed set is determined before hand which remains un- 
changed from iteration to iteration and no other scale other than 
those in this set is admissible. The Adaptive Scale Pixel (Asp) 
decomposition method estimates the best fit Asp at the location 
of the peak in the residual image at each iteration. Due to the in- 
herent coupling of pixels in the true image as well as due to the 
extent of the PSF, typically only a sub-set of the Aspen change 
significantly at each iteration. We refer to this sub-set as the 
"active-set". This active-set of Aspen is determined at each it- 
eration. The number of Aspen in this set determines the dimen- 
sionality of the search space at each iteration. However, since 
the members of the active-set are determined on-the-fly, the set 
of scales used at each iteration potentially changes from iter- 
ation to iteration as well as all possible scales are admissible. 
This is in contrast with the MS -Clean where only the selected 
set of scales are allowed and the set of scales remain fixed for 
all iteration. This effectively relaxes the MS-Clean assumption 
of an orthogonal space (i.e., the Aspen estimated in earlier iter- 
ations are subject to change in later iterations) and allows errors 
in earlier iterations to be compensated, to the extent possible, 
by adjusting the current set of active Aspen. This potentially 
also reduces the total number of components required. 

The general Asp decomposition of the image is expressed 

as: 

/^ = ^i'(/7,) (5) 

k 

where = {Amplitude, Location, S cale}. A simple functional 
form for P can be a circularly symmetric gaussian or any other 
more appropriate form. The best fit Aspen are found by mini- 
mizing the objective function 

^[V-AI^fv^[V-AI^] (6) 



where / is the current model image and W is the weights 
matrix. The update direction computation requires evaluation 
of the variation ofx^ with respect to p^s given by 

^ = _2 where B/^ (7) 

dpk ^ ^ dpt 

The parameters corresponding to the location, amplitude 
and scale are estimated at each iteration by searching for a peak 
in the residual images smoothed to a few scales. In practice, 
was found to be weakly dependent on the change in the posi- 
tion of the Aspen. The location of the Aspen is therefore kept 
fixed at the location of the peak, while the amplitude and scale 
parameters are adjusted by fitting. A new Asp is added at each 
iteration unless one of the termination criteria is satisfied. The 
various steps are therefore: 

1. Set/* 

2. At the «''' iteration, smooth by a gaussian beam at a few 
scales from the resolution element to a few times the reso- 
lution element. 

3. Search for a global peak (Fk) among these smoothed resid- 
ual images. Define a new Asp with amplitude cen- 
tered at the location of the peak {xic,yk) with scale 0-^. Add 
it to the current set of Aspen {P]„ = {P]n-\ U {A^}. 

4. Find the best-fit set {i'ln. 

5. Compute 

6. Update the residual image as = — BI^. 

7. Goto step 13 unless the termination criteria is met or the 
residuals are noise-like. 

The computation of the residual image, IS re- 

quired for the computation of each step-size in the minimiza- 
tion stepE] This involves two FFTs, one gridding and one de- 
gridding operation. For an NxN sized image, the cost of FFTs 
is of the order 2N^ log(N). The cost of gridding and de-gridding 
for a A^,. visibility database is dominated by the C>(A^,,) disk I/O. 
For typical sized visibility databases, the gridding costs dom- 
inate over the cost of FFTs. A less accurate residual compu- 
tation can be done by using a pre-gridded version of the vis- 
ibilities (this will be less accurate due to the use of gridded 
visibilites rather than the raw observed visibilites). For an Asp 
with an analytical expression for its Fourier transform, the cost 
of the residual image computation scales roughly as the cost 
of FFTs. This form of residual computation can be considered 
as the minor cycle of the Asp-Clean algorithm. Periodically, the 
residual computation is done at full accuracy. This computation 
of the residuals at full accuracy and the freedom to adjust the 
Aspen determined in earlier iterations ensures that the use of 
gridded visibilities in the minor cycle has insignificant impact 
on the final result. 

5. Examples 

The ability of the algorithm to build a model of the true image 
using minimum DOF was tested by simulating an image com- 
posed of components using the same form of Asp as is used in 
the deconvolution process. It is clear that an image consisting 
of well separated Asp components will pass such a test (such 
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Fig. 1. This figure shows an example using simulated image composed of two components with significantly different but over- 
lapping scales (left). A dirty image (center) was formed by the convolution of this model image with a typical interferometric 
PSF and deconvolved using the Asp-Clean algorithm. The Asp model image is shown in the right panel. Only two significant 
components were required for the residuals to be noise-like. 



a case will be equivalent to a test case for the normal CLEAN 
algorithm with only isolated unresolved sources). Therefore, a 
test image with an overlapping set of components at different 
scales (shown in the left panel of Fig.Q, was convolved with 
a typical interferometric PSF to form the dirty image (shown 
in the center panel of Fig. and a noise image of type /'^ 
(RMS noise of ImJy) was added. This image was decon- 
volved to generate the model image shown in the right panel 
of Fig. [0 Only two Asp components were required to achieve 
convergence and the scale of these two components was au- 
tomatically detected. In contrast, MS-Clean would need more 
components, unless of course the selected set of scales included 
precisly the two scales in the image. This demonstrates that un- 
der ideal conditions, where the true image is composed of only 
Asp shaped components, the algorithm does not introduce extra 
DOF. Although, in general where the true image is not strictly 
composed of only Asp shapes, it is impossible to design an al- 
gorithm which decomposes the image using minimum DOF, 
one can be optimistic that this algorithm will use fewer DOF 
compared to other algorithms. In practice, it is indeed observed 
that Asp-Clean uses about an order of magnitude fewer DOF 
than MS-Clean, which in turn uses few orders of magnitude 
fewer DOF than scale-less decomposition. Note that neither of 
these scale sensitive algorithms attempt to combine the com- 
ponents, e.g. construct a single component out of components 
located at the same pixel and of similar scales, to reduce the 
final number of components. 

The top left panel of Fig. |2] shows the "M31" image used 
as the test image in our simulation. This was convolved with 
the PSF to generate the simulated dirty image shown in the 
top right panel. The resolution in the original image was ^ 2" 
while the synthesized beam for the PSF was x 2.5" in size. 
Asp deconvolution resulted in the Asp model image shown in 
the lower left panel of Fig.|2] The image is composed of a num- 
ber of Aspen with the smallest being of the size of a single pixel 
(the bright "dots" in the image). Since the resolution element 



(the synthesized beam) is larger than the pixel size, this image 
needs to be smooth to the scale of the resolution element, which 
is shown in the lower right panel of Fig. |2l However, even 
without this smoothing operation, the Asp image is quite good 
showing all the morphological details of the original image. 
The emission, particularly the low level extended emission, is 
not broken up into small scales (as is in the case of scale-less 
deconvolution) and features with strong compact emission sur- 
rounded by extended emission are also properly reconstructed. 
The lower right panel shows the model image convolved with 
the estimated resolution element. Fig.|3shows the residual im- 
ages for the standard Cotton-Schwab Clean (CS-Clean), MS- 
Clean and the Asp-Clean algorithms. While the residuals for 
the CS-Clean and the MS-Clean are correlated with the large 
scale emission in the image, the Asp-Clean residuals are statis- 
tically consistent with the noise and have no significant corre- 
lated features at scales larger than the resolution element. 

6. Acceleration methods 

For an Aspen set {P]„ at iteration n, the dimensionality of the 
search space is MxN where M is the number of parameters per 
Asp. The total number of parameters monotonically increases 
as a function of iteration number and stepl^becomes inefficient 
for complex images where the total number of Aspen can be 
several hundred. 

Figure0]shows the evolution of the scale of the few largest 
Aspen for a typical test case image. To demonstrate the evo- 
lution of the scales, an Asp once introduced, was kept in the 
problem for all successive iterations-. The top panel of Fig.|4] 
shows that after initial adjustment, the scale of most Aspen 

- In the spirit of searching for an image decomposition with a min- 
imum degree of freedom, it is desirable to develop heuristics to com- 
pletely remove Aspen which were significant to begin with but have 
been rendered insignificant due to the addition of other Aspen in suc- 
cessive cycles. Furthermore, heuristics to merge different but similar 
Aspen will further help in reducing the degrees of freedom. Work on 
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Fig. 2. Figure showing an example of Asp reconstruction of a typical astronomical image. Top left panel shows the HI image 
made with the VLA, used as the "true image" (/") for the simulation. The image contains ~ 10000 pixels with significant 
emission. This image was used to simulate visibilities corresponding to a VLA observation. The corresponding dirty image (/''), 
shown in the top right panel, was then deconvolved using the Asp-Clean algorithm. A 800- Asp component reconstructed model 
image (/*') is shown in the lower left panel. The lower right panel shows the restored Asp-model image (C/** + where C is 
the smoothing operator corresponding to the resolution element). 



did not change significantly. The parts of the curves in the top 
panel with small derivatives correspond to small step-sizes in 
the minimization algorithm. Clearly, dropping Aspen which are 
unlikely to change significantly will greatly improve the perfor- 
mance with minimal adverse effect on the value of the objective 
function. 



exploring these possibilities is in progress and will be reported in fu- 
ture papers. 



For a search space with constant curvature along all axis, 
the update step-size is proportional to the magnitude of the 
derivative along the various axis. Heuristically, the step-size 
for each Aspen in the minimization algorithm is proportional 
to the length of the derivative vector with respect to its param- 
eters {Lk = \'^kx\ where V*: corresponds to the derivative op- 
erator with respect to the parameters of the k^^ Asp only). This 
suggests a simple way of determining the active-set - i.e. by 
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Fig. 3. Figure shows the comparison of the residual images (B/*' - /'') using the Cotton-Schwab Clean (left), MS-Clean (center) 
and Asp-Clean (right) algorithms. The Asp model image used to compute the Asp-Clean residual was the same as shown in 
Fig. 12 The residuals for Asp-Clean are consistent with the noise and contains no correlated features at scales larger than the 
resolution element. The residuals in the other two images are correlated with the large scale emission and are comparable in 
magnitude to the peak noise in the off-source regions. The Cotton-Schwab Clean was run for 50 000 components, MS-Clean for 
20000 components and Asp-Clean for 800 components such that the peak residual was same for all cases. MS-Clean was run 
with 5 scales of sizes 0, 3, 5, 10 and 15 pixels. 



computing Lk at the beginning of each minimization cycle and 
dropping Aspen for the current cycle for which Lk is below a 
threshold Lg. Although this effectively assumes that the curva- 
ture along all axis is constant and is the same, and will result 
into some mistakes, one can recover from mistakes in later cy- 
cles since such a heuristic is applied at the beginning of each 
cycle. Assuming that the;t'^ surface is well approximated by a 
parabola close to - 0, its slope progressively decreases as 
convergence is approached. Lo therefore should also decrease 
as a function of convergence. The area under the residual image 
is indicative of the degree of convergence. Hence, a threshold 
of Lo = /1 2 /* was applied to Lk at the beginning of each cycle 
to determine the active-set. The value of A controls the size of 
the active-set and needs to be determined empirically based on 
the available computing power and the complexity of the image 
- larger its value, the smaller the size of the active-set. Effects of 
the value of A on the rate of convergence and the quality of re- 
construction has not yet been studied well. The lower panel of 
Fig-Sshows the evolution of the scale of the same set of Aspen 
as shown in the top panel - but after applying this heuristic. 
Only the Aspen indicated by symbols on these curves, consti- 
tute the active set at each iteration. Roughly speaking. Aspen 
which were not evolving were automatically dropped from the 
problem. The figure also shows that the Aspen were dynami- 
cally included in the problem if it was estimated that adjusting 
them would have a significant impact on the y^- Fig.|5]shows 
the number of Aspen along with the area under the residual im- 
age as a function of the iteration number For the test image 
used, the initial iterations correspond to the larger Aspen. As 
the process progressed towards convergence, these large Aspen 
settled down, i.e. small adjustments to them had a lesser im- 
pact on convergence, than the addition of weaker and smaller 



scaled Aspen. Consequently, the initial Aspen dropped out of 
the problem, and at later times, convergence was achieved by 
keeping only a few (latest) Aspen. Effectively, this achieved a 
dynamic control on the dimensionality of the search space. 

Strictly speaking, the active-set depends on the structure 
of the PSF and the inherent coupling of pixels in the image. 
Thresholding on the derivative vector and ignoring the curva- 
ture might result into dropping some axis with large curva- 
ture but small slope. Keeping them in the problem can poten- 
tially improve the rate of convergence. However that would re- 
quire computation of the full covariance matrix, which is pro- 
hibitively expensive. An efficient algorithm to determine the 
structure of the covariance matrix, even approximately, will be 
most useful. Work for testing some of our ideas for such an 
algorithm is in progress. 

7. Summary 

Deconvolution of images with a large number of pixels with 
significant emission is inherently a high dimensional problem. 
The computational load depends on the structure of the covari- 
ance matrix, which in turn depends on the form of parameteri- 
zation of the model image as well as on the structure of the PSF. 
It is possible to design efficient algorithms to decouple the PSF 
and the true image for images with a simple covariance struc- 
ture (diagonal or band-diagonal matrix). The Pixon method is 
suitable for filled aperture telescopes since the PSF has a lim- 
ited support making the covariance matrix strictly band diago- 
nal at the worst, and the problem can be broken up into smaller 
local problems. 

In addition to dealing with the inherent coupling of the pix- 
els in the true image, interferometric imaging involves com- 
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Fig. 4. Evolution of a few large Aspen as a function of itera- 
tions for a test image. For the figure in the top panel, each Asp 
is kept in the problem after it is introduced. It shows that the 
size of most Aspen settle down and does not change signifi- 
cantly at each iteration. Flat portions in these curves imply in- 
significant step size in the optimization iterations. Dropping the 
Aspen in iterations where they do not change, significantly re- 
duces computational cost with minimal adverse eff'ect on con- 
vergence. The lower panel shows the evolution of the scales, 
after applying a heuristic at the beginning of each iteration to 
drop the Aspen which are unUkely to change significantly. At 
each iteration, only the Aspen marked with a symbol in this 
plot were retained for optimization for that iteration. 

puting the coupling of the pixels due to the PSF, often through 
out the image. This has to be done for each trial model im- 
age (or its components) in search of the best model image. The 
Pixon method is therefore unsuitable for interferometric imag- 
ing. MS-Clean partly takes into account the inherent coupling 
of the pixels of the true image by representing it as a collection 
of components at a few scales. It ignores the coupling between 
these components but takes care of the couphng due to the PSF 
by pre-computing its effects. The advantage of this approach is 
that the computation of the coupling remains a scale-and-shift 
operation, which is efficient, while making the algorithm scale 
sensitive, albeit in a limited sense. The disadvantage is that it 
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Fig. 5. Figure showing the number of Aspen used along with 
the rate of convergence as a function of the iteration number. 
After initial rise in the number of Aspen, convergence was 
achieved by keeping only the latest few Aspen in the problem, 
effectively achieving a dynamic control on the dimensionality 
of the search space. 

uses more DOF than necessary which leads to similar prob- 
lems as in the CLEAN algorithm (that of breaking up of large 
scale emission). Non-symmetric structures are also difficult to 
accurately reconstruction. 

Asp-Clean attempts to deal with these problems by expUc- 
itly solving the problem in the hyper-space of the Aspen pa- 
rameters. The true image is modeled in the continuous space of 
Aspen. Since it is easy to parameterize the Aspen such that non- 
symmetric Aspen are also allowed, this algorithm deals with 
the problem of non-symmetric structures well, and results in 
a model image with the least DOF as compared to other al- 
gorithms for interferometric imaging. E.g., a purely elliptical 
gaussian shaped feature will be broken up into several symmet- 
ric components in the case of MS-CIean, while Asp-Clean can 
represent it with a single component. However the computation 
of the coupling is inefficient making the search for the param- 
eters expensive. Since not all parameters continue to change 
throughout the search process, we have implemented heuris- 
tics to determine the active-set of parameters and dynamically 
Umit the dimensionality of the search space. This improves the 
performance by an order of magnitude or more. However the 
overall runtime is still about a factor of three more than MS- 
Clean. Asp-Clean approach represents relaxation of compute- 
saving assumptions built into Clean and MS-Clean approaches. 
It is therefore inherently more compute intensive than other de- 
convolution algorithms. Although it is certainly useful to de- 
velop heuristics which will minimize redundant intermediate 
computations, the Asp decomposition is Umited by the com- 
pute load for the search for the Asp parameters which in turn 
scales strongly with the size of the active-set at each iteration. 
Since the size of the active-set of Aspen is roughly a measure of 
the degree to which the non-orthogonaUty of the Aspen space 
(coupling between the Aspen) is incorporated in the algorithm, 
it is fair to expect it to impact the rate of convergence signif- 
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icantly. Faster convergence and possibly better reconstruction 
can be achieved with more computing power Asp-Clean there- 
fore scales better with the CPU speed, compared to other scale 
sensitive algorithms, and therefore has the potential of benefit- 
ing more from the Moore's law of CPU speeds. 

Currently we are using the standard routines for the search 
algorithm (conjugate gradient metho d and its variants or the 
Levenberg-Marquardt algorithm; see Im. Galassi et^ ( l2002h 
for details). Fine tuning this minimization specifically for this 
problem will further improve the runtime performance. We use 
gaussians as the functional form for Asp. The scale of such 
Aspen is controlled by the full-width-at-half-maximum of the 
gaussian function. However there is no limit on the shape of 
the Aspen. More sophisticated functional forms which allow 
independent control on the shape as well the scale of the Asp 
will further reduce the number of components needed. Also, 
improving the heuristics to determine the active-set of Aspen, 
and making the heuristic computation itself efficient will be a 
worthwhile future direction of work. 

Finally, we note that most of the Aspen with scale signif- 
icantly larger than zero are found in the initial few hundred 
iterations, where the extra computational cost for Asp-Clean 
is well justified. The computational cost does not reduce for 
zero (or close to zero) scale Aspen. It may be most effective 
to develop heuristics to switch to other scale-less algorithms at 
late times which are more efficient for zero scale Aspen. Work 
in all these directions is in progress. Report on this on-going 
work and the impact of such algorithms on the imaging perfor- 
mance of future interferometric telescopes and imaging modes 
like mosaicking will be the subject matter of future papers. 

The Asp-Clean algorithm as described in this paper is im- 
plemented as a Glish client in AIPS++ and can be run via a 
Glish script. Work is underway to incorporate this as one of 
the many available deconvolution algorithms in the standard 
AIPS-n- interferometric imaging tool. 
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